ABSTRACT



Eta-Squared (ES) is often used as a measure of strength of 
association of an effect, a measure often associated with effect size. It is 
also considered the proportion of total variance accounted for by an 
independent variable. It is simple to compute and interpret. However, it has 
one critical weakness cited by several authors (C. Huberty, 1994; P. Snyder 
and S. Lawson, 1993; and T. Snijders, 1996), and that is a sampling bias that 
leads to an inflated judgment of true effect. The purpose of this study was 
to determine the degree of inflation by determining how large ES is likely to 
be by chance, find methods of predicting the mean inflation, and then 
proposing the use of a corrected ES coefficient that is the observed ES minus 
the mean expected ES, a value added approach. A Monte Carlo study was set up 
using a number of samples from 2 to 10 and sample sizes from 5 to 100 in 
steps of 5. In each number of samples and sample size configuration, 10,000 
one-way analysis of variance replications, using samples drawn from the unit 
normal distribution, were conducted for a total of 1,800,000 replications. 
Patterns of observed ES values were examined for influences of number and 
size of samples. It was clear that ES was influenced by both of these factors. 
Trend analysis was conducted to determine equations that could be used to 
predict the mean chance-based ES for given number and size of samples. In a 
given research situation, the expected ES coefficient may be determined for 
comparison with the observed ES. Such an approach removes the bias cited as 
the major weakness of the use of ES as a measure of strength of association 
and makes it a more useful measure of non-chance influence. (Contains 6 
figures, 3 tables, and 43 references.) (Author/SLD) 
Abstract



Eta-Squared (ES) is often used as a measure of strength of association of an 
effect, a measure often associated with effect size. It is also considered the proportion of 
total variance accounted for by an independent variable. It is simple to compute and 
interpret. However, it has one critical weakness cited by several authors (Huberty, 
Snyder & Lawson, and Snijders) and that is a sampling bias that leads to an inflated 
judgment of true effect. The purpose of this research is to determine the degree of 
inflation by determining how large ES is likely to be by chance, find methods of 
predicting the mean inflation, and then proposing the use of a corrected ES coefficient 
which is the observed ES minus the mean expected ES, a value added approach. 

A Monte Carlo study was set up using number of samples from 2 to 10 and 
sample sizes from 5 to 100 in steps of 5. In each number of samples and sample size 
configuration, 10,000 one-way ANOVA replications, using samples drawn from the unit 
normal distribution, were conducted for a total of 1,800,000 replications. 

Patterns of observed ES values were examined for influences of number and size 
of samples. It was clear that ES was influenced by both of these factors. Trend analysis 
was conducted to determine equations that could be used to predict the mean chance- 
based ES for given number and size of samples. In a given research situation, the 
expected ES coefficient may be determined for comparison with the observed ES. Such 
an approach removes the bias cited as the major weakness of the use of Eta-squared as a 
measure of strength of association and makes it a more useful measure of non-chance 
influence. 



The Corrected Eta-Squared Coefficient: 

A Value Added Approach 

Eta-Squared (ES) is probably the most used measure of effect size in conjunction 
with ANOVA. It is a measure of the strength of association of an effect, a measure often 
associated with effect size. It is also considered the proportion of total variance 
accounted for by an independent variable. It is simple to compute and interpret. 

However, it has one critical weakness cited by several authors (Huberty, 1994; Snyder & 
Lawson, 1993; Snijders, 1996) and that is a sampling bias that leads to an inflated 
judgment of true effect. The purpose of this research is to determine the degree of 
inflation by determining how large ES is likely to be by chance, find methods of 
predicting the mean inflation, and then proposing the use of a corrected ES coefficient 
which is the observed ES minus the mean expected ES, a value added approach. 
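
For reference, the proportion-of-variance definition given above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' program; the function name `eta_squared` and the small data set are assumptions introduced here.

```python
def eta_squared(groups):
    """Eta-squared for a one-way layout: SS_between / SS_total."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    # Total sum of squares: every score against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    # Between-groups sum of squares: group means against the grand mean,
    # weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Two hypothetical groups of three scores each.
example = eta_squared([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

In this toy example the between-groups sum of squares is 1.5 and the total sum of squares is 5.5, so about 27% of the total variance is associated with group membership.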

Background 

The concept of effect size has been around for many years. Cohen (1969) is 
generally credited with coining the term. However, the development of meta-analysis by 
Glass, Rosenthal and others in the 1970s (e.g., Glass, 1976, 1978; Glass & Hakstian, 
1969; Rosenthal, 1976, 1978) and the popularity of a book on meta-analysis in 1981 
(Glass, McGaw, & Smith) were the catalysts for the interest in the concept. Numerous 
publications followed on applications of effect size methodology (e.g., Lynch, 1987; 
McLean, 1983), methods for estimating effect size and its properties (e.g., Fowler, 1988; 
Gibbons, Hedeker, & Davis, 1993; Hedges, 1981, 1984; Huynh, 1989; Kraemer, 1983; 
Reichhardt & Gollob, 1987; Thomas, 1986), extracting effect size estimates from existing 
studies (e.g., Hedges, 1982; Snyder & Lawson, 1993), and correcting effect size estimates 
(Snyder & Lawson, 1993). Another book by Wolf (1986) presented a general 
methodology for conducting meta-analysis including the extraction and testing of effect 
sizes. 

Perhaps no one has had a greater impact on the use of effect sizes than Cohen 
(1988) through his books on power analysis. In these books, Cohen suggests general 
guidelines for levels of effect size. These are .2 for small effect, .5 for medium effect, 
and .8 for large effect. However, even Cohen was concerned about proposing these as 
standards. He stated: 

The terms “small,” “medium,” and “large” are relative, not only to each 
other, but to the area of behavioral science or even more particularly to the 
specific content and research method being employed in any given 
investigation. In the face of this relativity, there is a certain risk inherent 
in offering conventional operational definitions for these terms for use in 
power analysis in as diverse a field of inquiry as behavioral science. This 
risk is nevertheless accepted in the belief that more is to be gained than 
lost by supplying a common conventional frame of reference which is 
recommended for use only when no better basis for estimating the ES 
index is available. (1988, p. 25) 
Cohen's concerns were cited by Wolf (1986), who suggests that effect sizes should 
be interpreted in context. Specifically, one possibility is to compare a given effect size to 
the median effect size of studies extracted from the professional literature in that specific 
context rather than use some arbitrary guideline. Wolf indicates that a .5 standard 
deviation improvement is often considered practically significant and that the general 
guidelines of the National Institute of Education's Joint Dissemination Review Panel 
require .33 effect size, but at times will accept .25 to establish educational significance. 

A broader debate on the use of statistical significance testing emerged from 
Cohen's power analysis books and other works. Kaufman (1998) indicates that the 
"controversy about the use or misuse of statistical significance testing has been evident in 
the literature for the past 10 years and has become the major methodological issue of our 
generation" (p. 1). The debate has spawned at least two special issues of journals 
(Research in the Schools, McLean & Kaufman, 1998; Journal of Experimental 
Education, Thompson, 1993) and dozens of other articles. The editorial policies of 
journals have been changed by the debate (e.g., APA, 1994; Schafer, 1990, 1991; 
Thompson, 1996, 1997). 

The debate has ranged from those who recommend the elimination of statistical 
significance testing (e.g., Carver, 1978, 1993; Nix & Barnette, 1998) to those who 
staunchly support it (e.g., Frick, 1996; Levin, 1993, 1998; McLean & Ernest, 1998). 
However, even those who defend statistical significance testing indicate that significant 
results should be accompanied by a measure of practical significance. The leading 
method of reporting practical significance is through the provision of an effect size 
estimate (Kirk, 1996; McLean & Ernest, 1998; Robinson & Levin, 1997; Thompson, 
1996). Unfortunately, the criteria for judging the practical significance of results based 
on effect size have defaulted to the use of Cohen's (1988) guidelines that even Cohen has 
warned us about (1988, 1990). As Wolf (1986) noted, empirical standards for judging 
effect size are needed. 

While other studies have suggested that reasonably large effect sizes might occur 
by chance (Barnette & McLean, 1999, November), no other studies could be found that 
used the relationship between known factors (such as sample size and number of groups) 
and effect size to predict effect size. If such a relationship can be verified, it would help 
researchers avoid the over-interpretation of effect sizes. 

Methods 

A Monte Carlo study was set up using number of samples from 2 to 10 and 
sample sizes from 5 to 100 in steps of 5. In each number of samples and sample size 
configuration, 10,000 one-way ANOVA replications were generated, using samples 
drawn from the unit normal distribution, for a total of 1,800,000 
replications. Data were generated using a program written in double-precision 
QuickBASIC. Analysis of the raw data was conducted using several routines of SAS®. The 
accuracy of this approach has been established in several other studies (e.g., Barnette & 
McLean, 1999, November). 
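
The simulation design described above can be sketched in Python. The original program was written in QuickBASIC and analyzed in SAS; the code below is only a reduced illustration under stated assumptions: the function names, the seed, and the default of 2,000 replications per cell (rather than 10,000) are hypothetical choices made here to keep the run short.

```python
import random


def eta_squared(groups):
    """Eta-squared for a one-way layout: SS_between / SS_total."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


def mean_chance_eta_squared(k, n, replications=2000, seed=0):
    """Mean eta-squared over replications in which all k population means are equal.

    Every score is drawn from the unit normal distribution, so any
    eta-squared observed is due to sampling alone.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
        total += eta_squared(groups)
    return total / replications


# One cell of the design: K = 2 samples of n = 5 cases each.
estimate = mean_chance_eta_squared(k=2, n=5)
```

Looping `k` over 2 to 10 and `n` over 5 to 100 in steps of 5 would cover the 180 cells of the design; with 10,000 replications per cell that corresponds to the 1,800,000 replications reported.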
Patterns of observed ES values were examined for their relationships with number 
and size of samples. Using these relationships, a regression equation was developed to 
predict effect size from number of subjects per group and number of groups. Tables and 
figures were developed to show the results. 



Results 



First, the accuracy of the Monte Carlo procedures used can be seen by inspecting 
Table 1. Table 1 shows the obtained p-values for each of the preset alpha values for the 
1.8 million replications. It was clear that ES was influenced by both number and size of 
samples. A regression-based trend analysis was conducted to determine equations that 
could be used to predict the mean chance-based ES for given number and size of samples. 
It was determined that a power-type function of the form $a\,n^{-b}$ was the best fit of the 
observed data. The regression equations produced $R^2$ values that were virtually 1. 

Keeping in mind that all of the data were produced with the means being equal for all 
groups in each model, the mean eta-squared values for each sample size/number of 
groups combination are shown in Table 2. Scanning across the rows and down the 
columns illustrates the trends. 
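
As a cross-check on the simulated means in Table 2 (this is not part of the original analysis), a standard distributional result for one-way ANOVA with normal errors and equal population means gives the chance mean in closed form. With K groups of n cases each and N = Kn total cases, the between- and within-groups sums of squares are independent chi-square variables, so:

```latex
\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}}
\sim \mathrm{Beta}\!\left(\frac{K-1}{2},\ \frac{N-K}{2}\right),
\qquad
E\left[\eta^2\right] = \frac{K-1}{N-1}
```

For K = 2 and n = 5 this gives 1/9, or about .1111, against the simulated .112864; for K = 10 and n = 5 it gives 9/49, or about .1837, against the simulated .183633.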

Table 3 shows the eta-squared values as a power-type function of the sample size 
for each number of groups. The equations for determining these values are also shown. 
The results are even clearer when depicted as graphs. Figures 1-6 show the results for 2, 
3, 5, 8, and 10 groups respectively. In each case the near-perfect fit of the regression 
lines is evident. 
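
One way to obtain a power-function fit of this kind is ordinary least squares on the log-log scale. The paper does not state which SAS routine was used, so the method below is an assumption, and the function name `fit_power_function` is hypothetical. The data are the K = 2 column of Table 2.

```python
import math

# Mean chance eta-squared for K = 2 groups, copied from Table 2
# (sample sizes n = 5, 10, ..., 100).
sample_sizes = list(range(5, 105, 5))
mean_eta_sq_k2 = [
    0.112864, 0.051924, 0.033855, 0.024876, 0.020277,
    0.017220, 0.014539, 0.012597, 0.011259, 0.010230,
    0.009067, 0.008367, 0.007597, 0.007272, 0.006640,
    0.006321, 0.005962, 0.005557, 0.005272, 0.005021,
]


def fit_power_function(x_values, y_values):
    """Fit y = a * x**(-b) by least squares on ln y = ln a - b ln x."""
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(y) for y in y_values]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope on the log-log scale is -b; the intercept is ln a.
    return math.exp(intercept), -slope


a_hat, b_hat = fit_power_function(sample_sizes, mean_eta_sq_k2)
```

The resulting `a_hat` and `b_hat` can be compared with the K = 2 row of Table 3; small differences would be expected if the original fit used a different estimation method.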



Here are a few examples of how this could be used: 



Situation 1: K= 2, n= 22 

Observed Eta-squared= .1876 
Predicted Eta-squared= .0235 

Proportion of variance accounted for by treatment above what would be 
expected by chance (the value added)= .1641 



Situation 2: K= 5, n= 50 

Observed Eta-squared= .2215 
Predicted Eta-squared= .0161 

Proportion of variance accounted for by treatment above what would be 
expected by chance (the value added)= .2054 



Situation 3: K= 8, n= 7 

Observed Eta-squared= .1134 
Predicted Eta-squared= .1268 

Proportion of variance accounted for by treatment above what would be 
expected by chance (the value added)= 0 
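
The three situations above can be reproduced from the Table 3 coefficients. The sketch below is illustrative only: the function and variable names are hypothetical, only the three rows of Table 3 needed here are included, and flooring the result at zero is inferred from Situation 3, where the observed value falls below the predicted chance value.

```python
# (a, b) coefficients from Table 3, keyed by number of groups K.
TABLE3_COEFFICIENTS = {
    2: (0.557324, 1.024959),
    5: (0.840826, 1.011329),
    8: (0.897923, 1.005868),
}


def predicted_chance_eta_squared(k, n):
    """Predicted mean chance eta-squared, a * n**(-b), for k groups of size n."""
    a, b = TABLE3_COEFFICIENTS[k]
    return a * n ** (-b)


def corrected_eta_squared(observed, k, n):
    """Observed minus predicted chance eta-squared, floored at zero."""
    return max(0.0, observed - predicted_chance_eta_squared(k, n))


value_added_1 = corrected_eta_squared(0.1876, k=2, n=22)
value_added_2 = corrected_eta_squared(0.2215, k=5, n=50)
value_added_3 = corrected_eta_squared(0.1134, k=8, n=7)
```

Rounded to four decimals, the predicted chance values agree with the .0235, .0161, and .1268 reported above.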
Discussion and Recommendations



It is obvious that one can use these results to estimate the eta-squared that might 
be expected by chance. In a given situation, subtracting the predicted chance eta-squared 
from the eta-squared obtained in an experiment would give the proportion of variance 
that could be attributed to the treatment beyond what would be expected by chance. 
Such an approach would remove the bias cited as the major weakness of the use of 
eta-squared as a measure of strength of association and make it a more useful measure of 
non-chance influence. 

We recommend that these results be replicated and, if they prove to be valid, that the use 
of the corrected eta-squared coefficient become common practice. At the very least, 
when an eta-squared value is cited, the chance eta-squared should be presented for comparison. 
One limitation of this research is that equal sample sizes were used. For this procedure to 
have maximum utility, a method for predicting the chance eta-squared when unequal sample 
sizes are used is needed. 
Table 1. Summary Statistics for Monte Carlo Replications, n = 1,800,000

Mean Probability of F          .500488
Observed p for alpha = .25     .249857
Observed p for alpha = .10     .100076
Observed p for alpha = .05     .050373
Observed p for alpha = .01     .010158
Observed p for alpha = .001    .001007
Observed p for alpha = .0001   .000099
Table 2. Eta-Squared by Number of Samples and Sample Size

n      K=2      K=3      K=4      K=5      K=6      K=7      K=8      K=9      K=10     Total
5      .112864  .143262  .157646  .167354  .172747  .176884  .178897  .182443  .183633  .163970
10     .051924  .068719  .076761  .082179  .084605  .086731  .088563  .090047  .090855  .080043
15     .033855  .045719  .050117  .053768  .055857  .057598  .058721  .059857  .060517  .052890
20     .024876  .034059  .037886  .040475  .041863  .043147  .044052  .044762  .045354  .039608
25     .020277  .027106  .030216  .032373  .033936  .034588  .035044  .035775  .036050  .031707
30     .017220  .022748  .025163  .026851  .027680  .028713  .029539  .029726  .030025  .026407
35     .014539  .019534  .021628  .023054  .023852  .024458  .025089  .025386  .025776  .022591
40     .012597  .016680  .018761  .019931  .021085  .021516  .021839  .022326  .022506  .019693
45     .011259  .014940  .016644  .017896  .018672  .019169  .019478  .019844  .020047  .017550
50     .010230  .013601  .015142  .016069  .016808  .017151  .017597  .017829  .017946  .015819
55     .009067  .012254  .013549  .014549  .015161  .015632  .016005  .016162  .016474  .014317
60     .008367  .011189  .012508  .013288  .013824  .014338  .014509  .014794  .014970  .013087
65     .007597  .010385  .011587  .012350  .012860  .013192  .013479  .013708  .013882  .012115
70     .007272  .009617  .010718  .011476  .011887  .012270  .012467  .012750  .012824  .011253
75     .006640  .008939  .010020  .010776  .011119  .011510  .011668  .011908  .011919  .010500
80     .006321  .008358  .009379  .010032  .010410  .010715  .010963  .011118  .011276  .009841
85     .005962  .007815  .008793  .009437  .009828  .010106  .010349  .010474  .010548  .009257
90     .005557  .007408  .008306  .008935  .009261  .009531  .009707  .009886  .010006  .008733
95     .005272  .007063  .007901  .008372  .008747  .009115  .009237  .009326  .009502  .008282
100    .005021  .006639  .007466  .008041  .008324  .008621  .008755  .008884  .009039  .007866
Total  .018836  .024802  .027510  .029360  .030426  .031249  .031798  .032350  .032658  .028777
Table 3. Eta-Squared as a Function of Sample Size for Number of Groups,
Eta-Squared = $a\,n^{-b}$

K     a          b          R^2
2     0.557324   1.024959   .999481
3     0.725375   1.018713   .999916
4     0.790802   1.012771   .999932
5     0.840826   1.011329   .999940
6     0.866752   1.009180   .999949
7     0.882383   1.006228   .999965
8     0.897923   1.005868   .999977
9     0.916229   1.06973    .999987
10    0.921270   1.005538   .999979

Coefficients a and b as function of K

$a = 0.001677K^3 - 0.038036K^2 + 0.293553K + 0.120054$ ($R^2 = .991440$)

$b = -0.000046K^3 + 0.00255K^2 - 0.011803K + 1.043852$ ($R^2 = .987739$)
[Figure: Eta-Squared as Function of Sample Size for K= 5 Groups, with Power Function Regression Line; x-axis: Sample Size; fitted coefficient 0.8408, R^2 = 0.9999]

[Figure: Eta-Squared as Function of Sample Size for K= 3 Groups, with Power Function Regression Line; fitted coefficient 0.7254, R^2 = 0.9999]

[Figure: Eta-Squared as Function of Sample Size for K= 2 Groups, with Power Function Regression Line; fitted coefficient 0.5573, R^2 = 0.9995]